Data Stream Mining with Multiple Sliding Windows for Continuous Prediction

Authors

  • Omer Mimran
  • Adir Even
Abstract

Data stream mining (DSM) deals with continuous online processing and evaluation of fast-accumulating data, in cases where storing and evaluating large historical datasets is neither feasible nor efficient. This research introduces the Multiple Sliding Windows (MSW) algorithm and demonstrates its application to a DSM scenario with discrete independent variables and a continuous dependent variable. MSW emerged from the need to dynamically allocate computational resources that are shared by many tasks; it predicts the required resources per task. The algorithm was evaluated with a large real-world dataset that reflects resource allocation at Intel's global data servers cloud. The evaluation assesses three MSW treatments: the use of multiple sliding windows, a novel iterative mechanism for feature selection, and adaptive detection of concept drifts. The evaluation showed positive and significant results in terms of prediction quality and the ability to adapt to swift and/or gradual changes in data stream characteristics. Following the successful evaluation, the adoption of the proposed MSW solution by Intel led to cost savings estimated in millions of dollars annually. While evaluated in a specific context, the generic and modular definition of the MSW permits implementation in other domains that deal with DSM problems of a similar nature.
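The abstract does not say how MSW combines its windows, so the Python sketch below is only an illustration of the general idea, not the authors' algorithm. It assumes that each combination of discrete variables (a hypothetical `key` tuple) keeps several sliding windows of different lengths, that each window predicts the mean of its stored values, and that the window with the lowest recent absolute error is used for the next prediction. The class name, the window sizes, and the error memory are all hypothetical choices; feature selection and concept-drift detection are not covered.

```python
from collections import defaultdict, deque
from statistics import mean


class MultiWindowPredictor:
    """Illustrative only: several sliding windows per discrete key,
    predicting a continuous target from the best-performing window."""

    def __init__(self, window_sizes=(5, 20, 100), error_memory=20):
        self.window_sizes = tuple(window_sizes)
        # One bounded deque per (key, window size); old values drop out automatically.
        self.windows = defaultdict(
            lambda: {n: deque(maxlen=n) for n in self.window_sizes}
        )
        # Recent absolute errors per (key, window size), used to rank the windows.
        self.errors = defaultdict(
            lambda: {n: deque(maxlen=error_memory) for n in self.window_sizes}
        )

    def predict(self, key):
        """Return the mean of the window with the lowest recent error,
        or None if nothing has been observed for this key."""
        windows = self.windows[key]
        if not windows[self.window_sizes[0]]:
            return None
        errors = self.errors[key]
        # Windows with no recorded error yet rank last; ties go to the first size listed.
        best = min(
            self.window_sizes,
            key=lambda n: mean(errors[n]) if errors[n] else float("inf"),
        )
        return mean(windows[best])

    def update(self, key, value):
        """Score every window against the new observation, then store it."""
        for n in self.window_sizes:
            window = self.windows[key][n]
            if window:
                self.errors[key][n].append(abs(mean(window) - value))
            window.append(value)


# Usage: a stream that jumps abruptly from 10.0 to 50.0 for one key.
predictor = MultiWindowPredictor(window_sizes=(3, 10), error_memory=5)
job = ("batch", "high_priority")  # hypothetical discrete variables
for observed in [10.0] * 30 + [50.0] * 10:
    predictor.update(job, observed)
prediction_after_shift = predictor.predict(job)
```

In this sketch the short window recovers from the abrupt change sooner than the long one, so it is the one selected after the shift; on a stable stream the windows tie and the first size listed is used.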


Related resources

Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows

Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...


Reducing Data Stream Sliding Windows by Cyclic Tree-Like Histograms

Data reduction is a basic step in a KDD process, useful for delivering more concise and meaningful data to successive stages. When mining is applied to data streams, which are continuous data flows, the issue of suitably reducing them is highly interesting, in order to arrange effective approaches requiring multiple scans on data, that, in such a way, may be performed over one or more reduced sli...


Hcluwin: an Algorithm for Clustering Heterogeneous Data Streams over Sliding Windows

Many applications in web usage mining, such as business intelligence and usage characterization, require effective and efficient techniques to discover the users with similar usage patterns and the web pages with correlated contents in the physical world. Clustering click streams can help to achieve this goal. Despite the high processing rate, the existing methods for clustering click streams ove...


Mining Frequent Itemsets for data streams over Weighted Sliding Windows

In this paper, we propose a new framework for data stream mining, called the weighted sliding window model. The proposed model allows the user to specify the number of windows for mining, the size of a window, and the weight for each window. Thus, users can specify a higher weight to a more significant data section, which will make the mining result closer to user’s requirements. Based on the w...


Concept Change Aware Dynamic Sliding Window Based Frequent Itemsets Mining Over Data Streams

Considering the continuity of a data stream, the accessed window information of a data stream may not be useful once a concept change takes effect on further data. In order to support frequent item mining over data streams, the interesting recent concept change of a data stream needs to be identified flexibly. Based on this, an algorithm can identify the range of the further window. A m...



Publication date: 2014